Assessing the Protection Provided by Misclassification-based Disclosure Limitation Methods for Survey Microdata
نویسندگان
چکیده
Government statistical agencies often apply statistical disclosure limitation techniques to survey microdata to protect the confidentiality of respondents. There is a need for valid and practical ways to assess the protection provided. This paper develops some simple methods for disclosure limitation techniques which perturb the values of categorical identifying variables. The methods are applied in numerical experiments based upon census data from the United Kingdom which are subject to two perturbation techniques: data swapping (random and targeted) and the post randomization method. Some simplifying approximations to the measure of risk are found to work well in capturing the impacts of these techniques. These approximations provide simple extensions of existing risk assessment methods based upon Poisson log-linear models. A numerical experiment is also undertaken to assess the impact of multivariate misclassification with an increasing number of identifying variables. It is found that the misclassification dominates the usual monotone increasing relationship between this number and risk so that the risk eventually declines, implying less sensitivity of risk to choice of identifying variables. The methods developed in this paper may also be used to obtain more realistic assessments of risk which take account of the kinds of measurement and other nonsampling errors commonly arising in surveys.
منابع مشابه
A CRONYM : Data without Boundaries D
Disclosure limitation methods for protecting the confidentiality ofrespondents in survey microdata often use perturbative techniques whichintroduce measurement error into the categorical identifying variables. Inaddition, the data itself will often have measurement errors commonly arisingfrom survey processes. There is a need for valid and practical ways to assess theprotect...
متن کاملPrivacy Protection from Sampling and Perturbation in Survey Microdata
Statistical agencies release microdata from social surveys as public-use files after applying statistical disclosure limitation (SDL) techniques. Disclosure risk is typically assessed in terms of identification risk, where it is supposed that small counts on cross-classified identifying key variables, i.e., a key, could be used to make an identification and confidential information may be learn...
متن کاملMicrodata Protection
Governmental, public, and private organizations are more and more frequently required to make data available for external release in a selective and secure fashion. Most data are today released in the form of microdata, reporting information on individual respondents. The protection of microdata against improper disclosure is therefore an issue that has become increasingly important and will co...
متن کاملData confidentiality: A review of methods for statistical disclosure limitation and methods for assessing privacy
There is an ever increasing demand from researchers for access to useful microdata files. However, there are also growing concerns regarding the privacy of the individuals contained in the microdata. Ideally, microdata could be released in such a way that a balance between usefulness of the data and privacy is struck. This paper presents a review of proposed methods of statistical disclosure co...
متن کاملUsing Noise for Disclosure Limitation of Establishment Tabular Data
We propose a new disclosure limitation method for establishment magnitude tabular data in which noise is added to the underlying microdata prior to tabulation. The proposed method has several advantages compared to the standard method of cell suppression: it enables some information to be provided within more cells of the table, it eliminates the need to coordinate cell suppression patterns bet...
متن کامل